57 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
English French Romanian
Availability:
Freely Available
License:
N/A
Size:
2,000,000 sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Emergent Communication Pretraining for Few-Shot Machine Translation
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yaoyiran Li | Europarl | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Basque English Finnish French Hungarian Romanian
Availability:
Freely Available
License:
MIT License
Size:
8130 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible
- Paper track: Speech/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marcely Zanon Boito | MaSS dataset | /N |
Documentation:
Documentation in English on the GitHub page
Written
Corpus,
Language Type:
Multilingual
Languages:
Czech Romanian Slovak Spanish Vietnamese
Availability:
Freely Available
License:
<Not Specified>
Size:
55 GByte
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: Diacritics Restoration Using Neural Networks
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Jakub Náplava | Charles University, Institute of Formal and Applied Linguistics | CZ | | |
| Author 2 | Milan Straka | Charles University | None | | |
| Author 3 | Pavel Straňák | Charles University in Prague | CZ | | |
| Author 4 | Jan Hajic | Charles University in Prague | CZ | Charles University | CZ |
| Main Contact | Jakub Náplava | Charles University, Institute of Formal and Applied Linguistics | None | | |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Bulgarian Czech English Modern Greek Romanian
Availability:
From Owner
License:
<Not Specified>
Size:
Approx. 6 GB unprocessed Other
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Bulgarian X-language Parallel Corpus
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Author 1 | Svetla Koeva | <Not Specified> | None | Institute of Bulgarian Language | None | DCL, IBL, BAS | BG | Department of Computational Linguistics, Institute for Bulgarian Language | None | Institute for Bulgarian Language | BG |
| Author 2 | Ivelina Stoyanova | <Not Specified> | None | | | | | | | | |
| Author 3 | Rositsa Dekova | <Not Specified> | None | | | | | | | | |
| Author 4 | Borislav Rizov | <Not Specified> | None | | | | | | | | |
| Author 5 | Angel Genov | <Not Specified> | None | | | | | | | | |
| Main Contact | Svetla Koeva | Institute for Bulgarian Language - Bulgarian Academy of Sciences | BG | Institute for Bulgarian Language, Bulgarian Academy of Sciences | BG | | | | | | |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Romanian
Availability:
From Owner
License:
GNU
Size:
3210/3210
Production Status:
Newly created-in progress
Use:
Transliteration
- Paper title: Transliteration and alignment of parallel texts from Cyrillic to Latin
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Petic Mircea | Institute of Mathematics and Computer Science of Academy of Sciences of Moldova, Chișinău | MD |
| Author 2 | Daniela Gîfu | Alexandru Ioan Cuza University of Iaşi | RO |
| Main Contact | Petic Mircea | Institute of Mathematics and Computer Science of Academy of Sciences of Moldova, Chișinău | None |
Documentation:
English
Language Type:
Multilingual
Languages:
Romanian
Availability:
free for search
License:
-
Size:
100000000 words
Production Status:
Newly created-in progress
Use:
Corpus specific uses
- Paper title: CoRoLa – The Reference Corpus of Contemporary Romanian Language
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Verginica Barbu Mititelu | RACAI | RO | | |
| Author 2 | Elena Irimia | RACAI, Romanian Academy Bucharest | None | RACAI | RO |
| Author 3 | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Main Contact | Verginica Barbu Mititelu | RACAI | None | | |
Documentation:
-
Language Type:
Multilingual
Languages:
Romanian
Availability:
From Data Center(s)
License:
ELRA
Size:
351 MByte
Production Status:
Existing-used
Use:
Corpus Creation/Annotation
- Paper title: The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Author 2 | Verginica Barbu Mititelu | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Author 3 | Elena Irimia | Research Institute for Artificial Intelligence | RO | Research Institute for Artificial Intelligence, Romanian Academy | RO |
| Author 4 | Ștefan Daniel Dumitrescu | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Author 5 | Tiberiu Boroș | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Main Contact | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | None | | |
Documentation:
yes, English, yes
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Romanian
Availability:
Not Available
License:
-
Size:
1257752812 tokens
Production Status:
Newly created-finished
Use:
Acquisition
- Paper title: The Reference Corpus of the Contemporary Romanian Language (CoRoLa)
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Verginica Barbu Mititelu | RACAI | RO | | |
| Author 2 | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | RO | | |
| Author 3 | Elena Irimia | Research Institute for Artificial Intelligence | RO | Research Institute for Artificial Intelligence, Romanian Academy | RO |
| Main Contact | Verginica Barbu Mititelu | RACAI | None | | |
Documentation:
yes, English and Romanian
Language Type:
Monolingual
Languages:
Romanian
Availability:
Freely Available
License:
<Not Specified>
Size:
800 mio tokens, 152 hours of recording, GByte
Production Status:
Newly created-finished
Use:
Parsing and Tagging
- Paper title: A Bird’s-eye View of Language Processing Projects at the Romanian Academy
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | RO |
| Author 2 | Cristea Dan | Institute for Computer Science of the Romanian Academy | RO |
| Main Contact | Dan Tufiș | Research Institute for Artificial Intelligence, Romanian Academy | None |
Documentation:
Yes, in Romanian
Written
Corpus,
Language Type:
Multilingual
Languages:
Romanian
Availability:
Freely Available
License:
CreativeCommons
Size:
684379 tokens
Production Status:
Newly created-finished
Use:
Discourse
- Paper title: Humour and non-humour in religious discourse
- Paper track: <Not Specified>
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Daniela Gifu | "Alexandru Ioan Cuza" University of Iasi, Faculty of Computer Science | RO |
| Author 2 | Liviu-Andrei Scutelnicu | „Alexandru Ioan Cuza” University, Faculty of Computer Science & Institute of Computer Science, Romanian Academy - Iaşi branch | None |
| Author 3 | Dan Cristea | „Alexandru Ioan Cuza” University, Faculty of Computer Science & Institute of Computer Science, Romanian Academy - Iaşi branch | None |
| Main Contact | Daniela Gifu | "Alexandru Ioan Cuza" University of Iasi, Faculty of Computer Science & Romanian Academy - Iasi branch, Institute of Computer Science | None |
Documentation:
<Not Specified>




